
    Simulation-Based Parallel Training

    Numerical simulations are ubiquitous in science and engineering. Machine learning for science investigates how artificial neural architectures can learn from these simulations to speed up scientific discovery and engineering processes. Most of these architectures are trained in a supervised manner and require tremendous amounts of simulation data that are slow to generate and memory-intensive to store. In this article, we present our ongoing work on a training framework that alleviates these bottlenecks by generating data in parallel with the training process. This simultaneity induces a bias in the data available during training, which we mitigate with a memory buffer. We test our framework on the multi-parametric Lorenz attractor and show its benefit over offline training, as well as the success of our bias-mitigation strategy in capturing the complex chaotic dynamics of the system.
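    The buffer idea described in this abstract can be illustrated with a minimal, hypothetical sketch. The `MemoryBuffer` class and the explicit-Euler Lorenz step below are illustrative simplifications assumed for the example, not the article's actual implementation: a fixed-capacity buffer overwrites random slots so that training batches mix old and recent simulation steps instead of seeing only the latest ones.

```python
import random

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One explicit-Euler step of the Lorenz system (illustrative only)."""
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return (x + dt * dx, y + dt * dy, z + dt * dz)

class MemoryBuffer:
    """Fixed-capacity buffer that mixes old and new samples, so a
    training batch is not dominated by the most recent simulation
    steps (the bias induced by online data generation)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []

    def add(self, sample):
        if len(self.data) < self.capacity:
            self.data.append(sample)
        else:
            # overwrite a random slot: keeps a mix of time steps
            self.data[random.randrange(self.capacity)] = sample

    def batch(self, size):
        return random.sample(self.data, min(size, len(self.data)))

# Generate a trajectory while a hypothetical trainer consumes mixed batches.
random.seed(0)
buf = MemoryBuffer(capacity=256)
state = (1.0, 1.0, 1.0)
for step in range(1000):
    nxt = lorenz_step(state)
    buf.add((state, nxt))   # (input, target) pair for a surrogate
    state = nxt

batch = buf.batch(32)
```

    In this sketch each batch draws uniformly from the whole retained history, which is one simple way to counteract the temporal bias of online generation.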

    The Challenges of In Situ Analysis for Multiple Simulations

    In situ analysis and visualization have mainly been applied to the output of a single large-scale simulation. However, topics involving the execution of multiple simulations on supercomputers have so far received only minimal attention. Important examples include uncertainty quantification, data assimilation, and complex optimization. In this position article, beyond highlighting the strengths and limitations of the tools that we have developed over the past few years, we share lessons learned from using them on large-scale platforms and from interacting with end users. We then discuss the forthcoming challenges that future in situ analysis and visualization frameworks will face when dealing with the exascale execution of multiple simulations.

    Training Deep Surrogate Models with Large Scale Online Learning

    The spatiotemporal resolution of Partial Differential Equations (PDEs) plays an important role in the mathematical description of the world's physical phenomena. In general, scientists and engineers solve PDEs numerically with computationally demanding solvers. Recently, deep learning algorithms have emerged as a viable alternative for obtaining fast solutions of PDEs. Models are usually trained on synthetic data generated by solvers, stored on disk, and read back for training. This paper argues that relying on a traditional static dataset does not allow the solver's full potential as a data generator to be exploited. It proposes an open-source online training framework for deep surrogate models. The framework implements several levels of parallelism focused on simultaneously generating numerical simulations and training deep neural networks. This approach suppresses the I/O and storage bottleneck associated with disk-loaded datasets and opens the way to training on significantly larger datasets. Experiments compare the offline and online training of four surrogate models, including state-of-the-art architectures. Results indicate that exposing deep surrogate models to more dataset diversity, up to hundreds of GB, can increase model generalization capabilities. The prediction accuracy of fully connected neural networks, the Fourier Neural Operator (FNO), and the Message Passing PDE Solver improves by 68%, 16%, and 7%, respectively.
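    The core pattern, generating data concurrently with training so that nothing touches the disk, can be sketched with a toy producer/consumer pair. The names and the toy "simulation" below are assumptions for illustration; the paper's framework uses several levels of parallelism across processes and nodes, not a single thread pair:

```python
import queue
import threading

def producer(q, n_samples):
    """Stand-in for a PDE solver: streams (input, output) pairs
    directly to the trainer instead of writing them to disk."""
    for i in range(n_samples):
        x = i / n_samples
        y = x * x              # toy 'simulation' result
        q.put((x, y))
    q.put(None)                # sentinel: generation finished

def trainer(q, seen):
    """Consumes samples as they arrive; a real implementation would
    form mini-batches and update network weights here."""
    while True:
        item = q.get()
        if item is None:
            break
        seen.append(item)

q = queue.Queue(maxsize=64)    # bounded queue: backpressure on the producer
seen = []
t_prod = threading.Thread(target=producer, args=(q, 1000))
t_train = threading.Thread(target=trainer, args=(q, seen))
t_prod.start(); t_train.start()
t_prod.join(); t_train.join()
```

    The bounded queue is the key design point: it throttles generation to the training speed, so samples flow through memory and the I/O and storage bottleneck of a static dataset never appears.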

    Unlocking Large Scale Uncertainty Quantification with In Transit Iterative Statistics

    Multi-run numerical simulations on supercomputers are increasingly used by physicists and engineers to deal with input data and model uncertainties. Most of the time, the input parameters of a simulation are modeled as random variables, and the simulation is then run a (possibly large) number of times with input parameters varied according to a specific design of experiments. Uncertainty quantification for numerical simulations is a hard computational problem, currently bounded by the large size of the produced results. This book chapter is about using in situ techniques to enable large-scale uncertainty quantification studies. We provide a comprehensive description of Melissa, a file-avoiding, adaptive, fault-tolerant, and elastic framework that computes statistical quantities of interest in transit. Melissa currently implements the on-the-fly computation of the statistics necessary for large-scale uncertainty quantification studies: moment-based statistics (mean, standard deviation, and higher orders), quantiles, Sobol' indices, and threshold exceedance.
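    The "iterative statistics" that make such a file-avoiding design possible can be illustrated with Welford's classic single-pass algorithm for the mean and variance. This is a generic sketch of the technique, not Melissa's actual code: each simulation result updates the running statistics and can then be discarded immediately.

```python
class IterativeStats:
    """Single-pass (Welford) mean and variance: each incoming value
    updates the statistics in O(1) memory, so full result sets never
    need to be stored on disk."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0          # running sum of squared deviations

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Unbiased sample variance."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

stats = IterativeStats()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(x)
# mean is 5.0; the sample variance is 32/7
```

    Higher-order moments, quantile estimators, and Sobol' indices admit similar one-pass update formulas, which is what allows the statistics to be computed in transit as results stream in.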

    A hybrid Reduced Basis and Machine-Learning algorithm for building Surrogate Models: a first application to electromagnetism

    A surrogate model approximates the outputs of a Partial Differential Equation (PDE) solver at a low computational cost. In this article, we propose a method to build learning-based surrogates for parameterized PDEs, i.e., PDEs that depend on a set of parameters while also being temporal and spatial processes. Our contribution is a method hybridizing the Proper Orthogonal Decomposition with several Support Vector Regression machines. We present promising results on a first electromagnetic use case (a primitive single-phase transformer).
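    The two-stage structure of such a hybrid surrogate can be sketched on a toy parameterized problem. Everything below is an assumed illustration: the snapshot family is synthetic, and a polynomial least-squares fit stands in for the Support Vector Regression machines of the article (scikit-learn's `SVR` could be substituted per mode).

```python
import numpy as np

# Toy snapshot matrix: each column is a solver output (a spatial field)
# for one value of the parameter mu (synthetic stand-in for a PDE solver).
mus = np.linspace(0.5, 2.0, 40)                  # parameter samples
xs = np.linspace(0.0, 1.0, 100)                  # spatial grid
snapshots = np.array([np.sin(mu * np.pi * xs) for mu in mus]).T  # (100, 40)

# Stage 1: Proper Orthogonal Decomposition via SVD; keep r modes.
U, S, Vt = np.linalg.svd(snapshots, full_matrices=False)
r = 5
basis = U[:, :r]                                 # reduced basis (100, r)
coeffs = basis.T @ snapshots                     # modal coefficients (r, 40)

# Stage 2: regress each modal coefficient on the parameter
# (polynomial fit here; the article uses one SVR per retained mode).
models = [np.polyfit(mus, coeffs[i], deg=5) for i in range(r)]

def surrogate(mu):
    """Predict the full spatial field for a new parameter value."""
    c = np.array([np.polyval(m, mu) for m in models])
    return basis @ c

pred = surrogate(1.3)
truth = np.sin(1.3 * np.pi * xs)
rel_err = np.linalg.norm(pred - truth) / np.linalg.norm(truth)
```

    The design point is the split: the expensive high-dimensional field is compressed once into a few modes, and only the low-dimensional parameter-to-coefficient maps need to be learned.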

    Calibration and Spectral Reconstruction for CRISATEL: an Art Painting Multispectral Acquisition System

    The CRISATEL multispectral acquisition system is dedicated to the digital archiving of fine art paintings. It is composed of a dynamic lighting system and a high-resolution camera equipped with a CCD linear array, 13 interference filters, and several built-in electronically controlled mechanisms. A custom calibration procedure has been designed and implemented. It allows us to select the parameters used for raw image acquisition and to collect experimental data, which are used in the post-processing stage to correct the acquired multispectral images. Various techniques have been tested and compared to reconstruct the spectral reflectance curve of the painting surface imaged at each pixel. Realistic colour rendering under any illuminant can then be obtained from this spectral reconstruction. The results obtained with the CRISATEL acquisition system and the associated multispectral image processing are shown on two art painting examples.
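    A common linear approach to the spectral reconstruction problem described here can be sketched as follows. The filter shapes, wavelength grid, and Tikhonov regularization below are assumptions for illustration, one family of techniques among those the paper compares, not CRISATEL's actual calibrated pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)

# Wavelength grid and a smooth synthetic 'true' reflectance to recover.
wl = np.linspace(400, 760, 37)                   # nm, 10 nm steps
true_refl = 0.5 + 0.4 * np.sin((wl - 400) / 360 * np.pi)

# 13 Gaussian band-pass curves stand in for the camera's interference
# filters; rows = filters, columns = wavelengths.
centers = np.linspace(410, 750, 13)
F = np.exp(-((wl[None, :] - centers[:, None]) ** 2) / (2 * 20.0 ** 2))

# Camera responses: filtered integrals of the reflectance, plus noise.
responses = F @ true_refl + rng.normal(0.0, 1e-3, 13)

# Linear reconstruction by regularized pseudo-inverse (Tikhonov/ridge):
# recover 37 spectral samples from 13 filtered measurements.
lam = 1e-2
recon = np.linalg.solve(F.T @ F + lam * np.eye(len(wl)), F.T @ responses)

rel_err = np.linalg.norm(recon - true_refl) / np.linalg.norm(true_refl)
```

    Since 13 measurements must constrain a finer spectral sampling, the problem is underdetermined, which is why regularization (or training against measured calibration targets) is needed to pick a plausible smooth spectrum.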
